Maximum Correlation Analysis (MCA; Bretherton et al., 1992) is similar to Empirical Orthogonal Function Analysis (EOF) in that they both deal with the decomposition of a covariance matrix. In EOF, this is a covariance matrix based on a single spatio-temporal field, while MCA is based on the decomposition of a "cross-covariance" matrix derived from two variables. The resulting expansion coefficients (ECs) associated with the left and right singular vectors can then be projected onto the original data to obtain homogeneous and heterogeneous regression maps.
This notebook applies MCA method to real data—the tropical Pacific sea level pressure (SLP) and sea surface temperature (SST) fields—and identify the coupled patterns from the data, including the famous tropical Pacific climate variability known as the El Nin o–Southern Oscillation (ENSO). ENSO has warm El Nino states and cool La Nina states, with changes found not only in the SST but also in the SLP.
The MCA approach is chosen because it is able to capture patterns of maximum covariance between two variables; it has been found to reasonably capture atmospheric and oceanic processes (Wilks, 2015). It is a robust method to investigate dominant modes of interaction, because it favors a better understanding of the relationship between groups of variables (Frankignoul et al., 2011).
The SLP and SST data are downloaded from http://www.esrl.noaa.gov/psd/gcos_wgsp/Gridded/data.hadslp2.html and http://www.esrl.noaa.gov/psd/thredds/catalog/Datasets/kaplan_sst/catalog.html, respectively.
In [1]:
%matplotlib inline
import numpy as np
import xarray as xr
import cartopy.crs as ccrs
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib as mpl
import matplotlib.gridspec as gridspec
import matplotlib.dates as mdates
mpl.rcParams['figure.figsize'] = 8.0, 4.0
mpl.rcParams['font.size'] = 13
In [2]:
ds1 = xr.open_dataset('data/slp.mnmean.hadslp2.nc')
slp = ds1.slp.sel(lat=slice(30, -30), lon=slice(180, 290), time=slice('1950-01-01','2005-12-31'))
lon_slp = ds1.lon.sel(lon=slice(180, 290))
lat_slp = ds1.lat.sel(lat=slice(30, -30))
dates = ds1.time.sel(time=slice('1950-01-01','2005-12-31')).values
# climatology
slp_clm = slp.groupby('time.month').mean(dim='time')
# anomaly
slp_anom = slp.groupby('time.month') - slp_clm
In [3]:
ds2 = xr.open_dataset('data/sst.mon.anom.kaplan.nc')
sst_anom = ds2.sst.sel(lat=slice(-30, 30), lon=slice(180, 290), time=slice('1950-01-01','2005-12-31'))
lat_sst = ds2.lat.sel(lat=slice(-30, 30))
lon_sst = ds2.lon.sel(lon=slice(180, 290))
In [4]:
slp2d = slp_anom.values
ntime, nrow_slp, ncol_slp = slp2d.shape
slp2d = np.reshape(slp2d, (ntime, nrow_slp*ncol_slp), order='F')
sst2d = sst_anom.values
ntime, nrow_sst, ncol_sst = sst2d.shape
sst2d = np.reshape(sst2d, (ntime, nrow_sst*ncol_sst), order='F')
nonMissingIndex = np.where(np.isnan(sst2d[0]) == False)[0]
sst2dNoMissing = sst2d[:, nonMissingIndex]
In [5]:
Cxy = np.dot(slp2d.T, sst2dNoMissing)/(ntime-1.0)
U, s, V = np.linalg.svd(Cxy, full_matrices=False)
V = V.T
Normalize MCD mode 1 by standardizing the EC1, so patterns correspond to a 1-std deviation variation in EC1. The positive EC1 of SST should match the Nino SSTA.
In [6]:
scf = s**2./np.sum(s**2.0)
In [7]:
# SLP MCA pattern
U1 = np.reshape(U[:,0, None], (nrow_slp, ncol_slp), order='F')
# EC1 of SLP
a1 = np.dot(slp2d, U[:,0, np.newaxis])
# normalize
U1_norm = U1*np.std(a1)
a1_norm = a1/np.std(a1)
In [8]:
# SST MCA pattern
V1 = np.ones([nrow_sst*ncol_sst,1]) * np.NaN
V1 = V1.astype(V.dtype)
V1[nonMissingIndex,0] = V[:,0]
V1 = V1.reshape([nrow_sst,ncol_sst], order='F')
# EC1 of SST
b1 = np.dot(sst2dNoMissing, V[:,0, np.newaxis])
# normalize
V1_norm = V1*np.std(b1)
b1_norm = b1/np.std(b1)
In [9]:
plt.plot(np.cumsum(scf),'x')
plt.xlabel('SVD mode')
plt.ylabel('Cumulative squares covariance fraction')
plt.ylim([0.7,1.1])
plt.xlim([-0.5, 40])
Out[9]:
In [10]:
gs = gridspec.GridSpec(2, 2)
gs.update(wspace=0.1, hspace=0.15)
fig = plt.figure(figsize = (14,10))
levels = np.arange(-1.0, 1.01, 0.05)
# SLP Pattern
ax0 = fig.add_subplot(gs[0,0], projection=ccrs.PlateCarree())
x1, y1 = np.meshgrid(lon_slp, lat_slp)
cs = ax0.contourf(x1, y1, U1_norm,
levels=levels,
transform=ccrs.PlateCarree(),
cmap='RdBu_r')
cb=fig.colorbar(cs, ax=ax0, shrink=0.8, aspect=20)
ax0.coastlines()
ax0.set_global()
ax0.set_extent([-180, -70, -19, 19])
ax0.set_title('Normalized SLP MCA Mode 1')
# SST Pattern
ax1 = fig.add_subplot(gs[0,1], projection=ccrs.PlateCarree())
x2, y2 = np.meshgrid(lon_sst, lat_sst)
cs2 = ax1.contourf(x2, y2, V1_norm,
levels=levels,
transform=ccrs.PlateCarree(),
cmap='RdBu_r')
cb=fig.colorbar(cs, ax=ax1, shrink=0.8, aspect=20)
ax1.coastlines()
ax1.set_global()
ax1.set_extent([-180, -70, -19, 19])
ax1.set_title('Normalized SST MCA Mode 1')
# EC1
ax2 = fig.add_subplot(gs[1,:])
ax2.plot(dates, a1_norm, label='SLP')
ax2.plot(dates, b1_norm, label='SST')
r = np.corrcoef(a1[:,0], b1[:,0])[0, 1]
ax2.set_title('Expansion Coefficients: SFC = '+ str(round(scf[0],2)) + ', R = ' + str(round(r,2)))
ax2.legend()
ax2.set_ylim([-4,4])
ax2.format_xdata = mdates.DateFormatter('%Y')
fig.autofmt_xdate()
The above figure shows the first MCA mode calculated between the monthly anomaly fields of Sea Surface Pressure (SLP) and Sea Surface Temperature (SST) for the region of the equatorial Pacific. Mode 1 explains a large portion of the squared covariance (93%) and shows the strong large scale coupling between SLP and SST anomalies (e.g. areas with positive MCA values in SLP appear to coincide with negative SST). This is what we would expect of the El Niño Southern Oscillation (ENSO) - High SLP anomalies in the western Pacific is typical of the warm El Niño phase, which results in warmer SST anomalies throughout the equatorial Pacific. The opposite La Niña phase results from low SLP in the western Pacific.
Allan, R., and T. Ansell, 2006: A New Globally Complete Monthly Historical Gridded Mean Sea Level Pressure Dataset (HadSLP2): 1850-2004. J. Climate, 19, 5816-5842.
Kaplan, A., M. Cane, Y. Kushnir, A. Clement, M. Blumenthal, and B. Rajagopalan, Analyses of global sea surface temperature 1856-1991, Journal of Geophysical Research, 103, 18,567-18,589, 1998
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Édouard Duchesnay. Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research, 12, 2825-2830 (2011)
Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. The NumPy Array: A Structure for Efficient Numerical Computation, Computing in Science & Engineering, 13, 22-30 (2011), DOI:10.1109/MCSE.2011.37
Fernando Pérez and Brian E. Granger. IPython: A System for Interactive Scientific Computing, Computing in Science & Engineering, 9, 21-29 (2007), DOI:10.1109/MCSE.2007.53
John D. Hunter. Matplotlib: A 2D Graphics Environment, Computing in Science & Engineering, 9, 90-95 (2007), DOI:10.1109/MCSE.2007.55
Hoyer, S. & Hamman, J., (2017). xarray: N-D labeled Arrays and Datasets in Python. Journal of Open Research Software. 5(1), p.10. DOI: http://doi.org/10.5334/jors.148
Bretherton CS, Smith C, Wallace JM (1992) An intercomparison of methods for finding coupled patterns in climate data. J Clim 5:541–560
Frankignoul C, Chouaib N, Liu Z (2011) Estimating the observed atmospheric response to SST anomalies: maximum covariance analysis, generalized equilibrium feedback assessment, and maximum response estimation. J Climate 24(10):2523–2539. doi:10.1175/2010JCLI3696.1
Wilks DS (2015) Multivariate ensemble model output statistics using empirical copulas. Q J R Meteorol Soc 141(688):945–952
In [ ]: